Complexity vs. Load Factor (\(\alpha\))

Average probe cost grows with load. Different collision strategies behave differently as \(\alpha \to 1\).

Separate Chaining

Successful: \(1 + \frac{\alpha}{2}\)
Unsuccessful: \(\alpha\)

Linear Probing

Successful: \(\tfrac{1}{2}\big(1 + \tfrac{1}{1-\alpha}\big)\)
Unsuccessful: \(\tfrac{1}{2}\big(1 + \tfrac{1}{(1-\alpha)^2}\big)\)

Holds only while the load factor is kept below a policy threshold and the hash distributes well.

Adversarial keys or a poor hash can force all items into few buckets / clusters.

Separate Chaining: Bucket holds list/tree. Expected length ≈ \(\alpha\).
Open Addressing: All keys in array; probe sequence on collision.
- Linear probing: Simple, primary clustering.
- Quadratic probing: Mitigates some clustering.
- Double hashing: Better dispersion.
Robin Hood / Cuckoo / Hopscotch: Reduce variance & worst clusters.

Load Factor \(\alpha\): 0.50

🚫 No perfect hash (general case): Finite table vs huge / unknown key space ⇒ collisions.
🧩 Strategies alleviate, not eliminate: Chains / probe sequences still lengthen as \(\alpha\) rises.
📈 Divergence: Linear probing terms with \(1/(1-\alpha)\) (and squared) explode near 1.
⏱️ Degradation signal: Rising average probes or variance ⇒ approaching resize threshold.
Resize cost: One \(O(n)\) rehash → amortized still \(O(1)\).
🔚 Bottom line: Good hash + collision handling delay slowdown, but sustained performance ultimately needs more buckets ✅.

Takeaway: Monitor load; when efficiency erodes, expand buckets to keep expected \(O(1)\) operations.